cDNA clones mapping within the first 2601 bases of the 3’ end of the porcine transmissible gastroenteritis coronavirus
(TGEV) genome were sequenced by the method of Maxam and Gilbert and an open reading frame yielding a
protein having properties of the matrix (M or E1) protein was identified. It is positioned at the 5’ side of the nucleocap- 
sid (N) gene from which it is separated by an intergenic stretch of 12 bases. The deduced M protein comprises 262 
amino acids, has a molecular weight of 29,544, is moderately hydrophobic, and has a net charge of +7 at neutral pH. 
Thirty-four percent of its amino acid sequence is homologous with the M protein of the bovine coronavirus (BCV), 32% 
with that of the mouse hepatitis coronavirus (MHV), and 19% with that of the avian infectious bronchitis coronavirus 
(IBV). Judging from alignment with the BCV, MHV, and IBV M proteins, the amino terminus of the TGEV M protein 
extends 54 amino acids from the virion envelope which compares with only 28 for BCV, 26 for MHV, and 21 for IBV. 
Eleven of the sixteen amino-terminal amino acids are hydrophobic and the positions of charged amino acids around 
this sequence suggest that the first 16 amino acids comprise a potentially cleavable signal peptide for membrane 
insertion. A similar sequence is not found in the M proteins of BCV, MHV, or IBV. When mRNA from infected cells, or
RNA prepared by in vitro transcription of the reconstructed M gene, was translated in vitro in the presence of
microsomes, the M protein became translocated and glycosylated. When a protein without the amino-terminal signal 
peptide was made by translating a truncated version of the M gene transcript, some translocation and glycosylation 
also occurred suggesting that the amino-terminal signal peptide on the TGEV M protein is not an absolute requirement 
for membrane translocation. Interestingly, the amino-terminal peptide did not appear to be cleaved during in vitro
translation in the presence of microsomes, suggesting that a step in virion assembly may be required for proper
exposure of the cleavage site to the signal peptidase. © 1988 Academic Press, Inc.


INTRODUCTION 


The porcine transmissible gastroenteritis corona- 
virus (TGEV) comprises three major structural pro- 
teins: an internal nucleocapsid phosphoprotein (N) of 
43 kDa and two glycosylated envelope proteins, one of 
29 kDa (a matrix-like protein, M or E1) and one of 200 
kDa (the peplomeric, P, or E2 protein) (Brian et al.,
1983; Garwes and Pocock, 1975; Kapke and Brian,
1986; Wesley and Woods, 1986). While the 200-kDa P
glycoprotein is demonstrably important in stimulating
neutralizing antibody (Garwes et al., 1978), the 29-kDa
M glycoprotein may also be important, especially if
complement is part of the virus-antibody reaction
(Woods et al., 1987).

To investigate the role of individual viral proteins in 
virus replication and in induction of immunity, we have 
prepared cDNA clones beginning from the polyadenyl- 
ated 3' end of the TGEV genome and examined the 
sequences of potential genes (Kapke and Brian, 1986).
Within the first (3’) 2000 bases, we deduced, from an 
examination of open reading frames, a noncoding re- 
gion of 276 bases, and genes for a 9101 mol wt hypo- 
thetical hydrophobic polypeptide, a 43,426 mol wt nu- 
cleocapsid protein, and part of a matrix protein, ar- 
ranged in that order from the 3’ end of the genome. 
Assuming that a conserved intergenic sequence 
would be found in TGEV as has been found in the 
mouse hepatitis coronavirus (MHV) (Budzilowicz et al.,
1985), and the avian infectious bronchitis coronavirus 
(IBV) (Brown and Boursnell, 1984), we prepared a syn- 
thetic oligodeoxynucleotide that is complementary to 
the TGEV intergenic sequence and used it as a primer 
for first-strand DNA synthesis in the preparation of ad- 
ditional genomic cDNA clones. Several cDNA clones 
were thus prepared and seven that mapped within the 
first (3’) 2601 bases were sequenced in part and another
clone was sequenced completely to derive a
potential gene sequence for the M protein. The nu- 
cleotide sequence predicted an M protein that shared 
many features with the M proteins of the mouse hepatitis
virus, the bovine coronavirus, and the avian infectious
bronchitis virus, but also predicted an unexpected,
potentially cleavable amino-terminal signal
peptide that makes the TGEV M protein strikingly different.
In this study, we have confirmed our preliminary
report of the M gene sequence (Kapke et al., 1987),
and have examined the behavior of the amino-terminal 
peptide during synthesis of the M protein. 


MATERIALS AND METHODS 
Cells and virus 


The Purdue strain of TGEV was grown on swine 
testicle (ST) cells as previously described (Kapke and 
Brian, 1986). 


cDNA cloning of TGEV genomic RNA 


Virus genomic RNA was prepared as previously de- 
scribed (Kapke and Brian, 1986). cDNA cloning was 
accomplished by the method of Gubler and Hoffman 
(1983) essentially as described (Kapke and Brian, 
1986) except that the synthetic oligodeoxynucleotide
5’-TTAGAAGTTTAGTTA-3’ was used as primer for
first-strand cDNA synthesis. The primer was synthesized
by the phosphoramidite method and was purified
by polyacrylamide gel electrophoresis. Clones
were selected by colony hybridization to random- 
primed cDNA prepared from size-selected genomic 
RNA (Kapke and Brian, 1986). Clones were initially 
physically mapped to obtain their approximate position 
on the genome by using a matrix cross-hybridization 
method in which plasmid DNA from individual clones 
was probed with purified inserts or segments of inserts 
that had been radiolabeled with ³²P by nick-translation.
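
As an illustrative check that is not part of the original methods, the primer can be reverse-complemented and located on the virus-sense sequence. The sketch below assumes the primer reads 5’-TTAGAAGTTTAGTTA-3’ and uses bases 911 through 950 as printed in Fig. 2; the names `reverse_complement`, `PRIMER`, and `FIG2_911_950` are ours and purely illustrative.

```python
# Illustrative sketch: where would the cDNA primer anneal on the
# virus-sense sequence? Sequence data are copied from the text and Fig. 2.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


PRIMER = "TTAGAAGTTTAGTTA"          # synthetic primer, written 5' to 3'
FRAGMENT_START = 911                # Fig. 2 coordinate of the first base below
FIG2_911_950 = "TTACATATGGTATAACTAAACTTCTAAATGGCCAACCAGG"

# The primer anneals to the virus-sense strand at its reverse complement.
binding_site = reverse_complement(PRIMER)
offset = FIG2_911_950.find(binding_site)
site_start = FRAGMENT_START + offset if offset >= 0 else None

print(binding_site)                 # virus-sense target of the primer
print("CTAAAC" in binding_site)     # does the target include the intergenic motif?
print(site_start)                   # Fig. 2 coordinate where the target begins
```

Under these assumptions the target of the primer contains the CTAAAC motif and falls at the M/N gene junction of the printed sequence, which is consistent with the primer design described above.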


DNA sequencing and sequence analyses 


DNA sequencing was done by the chemical method 
of Maxam and Gilbert (1980) and sequence analyses 
were done with the aid of the computer programs de- 
veloped by Queen and Korn (1984) and marketed as 
part of the Beckman Microgenie system, October 
1986 version (Beckman Instruments, Inc.). For protein 
homology determinations among the coronavirus M 
proteins, the “protein alignment” program of the
Beckman Microgenie was used with MAXDIST set at
1000. For protein homology searches among signal
peptides, the “IFIND” program of the Intelligenetics
system (Intelligenetics, Mountain View, CA) was used 
to search the National Biomedical Research Founda- 
tion Protein database. 


Reconstruction of the M gene and synthesis 
of transcripts in vitro 


To reconstruct the full-length M gene, clones FT36 
and C4 were digested with AccI and the small fragment
of FT36 (which contained the first 782 bases of
the sequence shown in Fig. 2) and the large fragment 
of C4 (which contained bases 783 through 934, an 
oligo(dC) tail of 13 bases, and the rest of the pUC9 
sequence) were ligated and used to transform Esche- 
richia coli strain 294. The insert was removed from this 
plasmid using PstI, digested with Bsp1286 to remove
the 5’ 114 bases, and blunt-ended with mung bean 
nuclease. The fragment now contained 13 bases up- 
stream (5’-ward) from the CTAAAC presumed inter- 
genic sequence (or 22 bases upstream from the pre- 
sumed M gene start codon) and extended to the end 
of the 13-base C-tail which begins 9 bases down- 
stream from the M gene stop codon (bases 922-924 
in Fig. 2). The blunt-ended fragment was ligated into 
the SmaI site of the pGEM3 vector (Promega Biotec)
and the orientation yielding a sense-strand RNA by 
transcription with SP6 polymerase was chosen. The 
construct was designated pGEM-M-1. Transcripts 
were prepared from EcoRI-cut plasmid using the SP6
transcription system marketed by Promega Biotec. 
To reconstruct the truncated version of the M gene 
(i.e., the M gene with no N-terminal signal peptide), 
clone pGEM-M-1 was cut with SphI, which cuts within
the multiple cloning region of the pGEM vector and 
between bases 174 and 175 in Fig. 2, and the result- 
ing large fragment was isolated, religated, and desig- 
nated pGEM-M-2. Translation of pGEM-M-2 tran- 
scripts allows initiation at the second AUG down- 
stream from the CTAAAC intergenic sequence 
(beginning at base 200 in Fig. 2), i.e., at the fifth amino 
acid downstream from the potential peptidase cleav- 
age site between glycine and lysine (von Heijne, 1986). 
The sequence at the 5’ junction of the insert (virus
sense) and the vector was confirmed by sequencing
for both the pGEM-M-1 and pGEM-M-2 constructs.
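
The logic of the truncation can be sketched in a few lines. The example below is a hedged illustration, not the authors' procedure: it assumes the SphI recognition sequence GCATGC with top-strand cleavage GCATG^C, uses bases 121 through 210 as printed in Fig. 2, and considers only AUG codons in the M reading frame (start at base 137, as stated in the text). All variable and function names are ours.

```python
# Illustrative sketch: which in-frame start codon survives the SphI deletion?

FRAGMENT_START = 121                # Fig. 2 coordinate of the first base below
FIG2_121_210 = (
    "GCTTGAACTAAACAAAATGAAGATTTTGTTAATATTAGCG"
    "TGTGTGATTGCATGCGCATGTGGAGAACGCTATTGTGCTA"
    "TGAAATCCGA"
)

SPHI_SITE = "GCATGC"                # SphI recognition sequence
SPHI_CUT_AFTER = 5                  # top strand is cut as GCATG^C
M_START = 137                       # first AUG downstream of CTAAAC (from the text)


def fig2_position(index: int) -> int:
    """Convert a 0-based index in the fragment to a Fig. 2 coordinate."""
    return FRAGMENT_START + index


def in_frame_atgs(seq: str, start_index: int) -> list[int]:
    """Return indices of ATG codons in the frame that begins at start_index."""
    return [i for i in range(start_index, len(seq) - 2, 3) if seq[i:i + 3] == "ATG"]


site_index = FIG2_121_210.find(SPHI_SITE)
last_base_removed = fig2_position(site_index) + SPHI_CUT_AFTER - 1

starts = [fig2_position(i) for i in in_frame_atgs(FIG2_121_210, M_START - FRAGMENT_START)]
retained = [p for p in starts if p > last_base_removed]

print(last_base_removed)            # last viral base upstream of the SphI cut
print(starts)                       # in-frame AUG codons in this fragment
print(retained)                     # in-frame AUG codons left in the truncated insert
```

With the printed sequence, the cut falls between bases 174 and 175, so the first in-frame AUG is deleted and the in-frame AUG at base 200 remains available for initiation.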


Isolation of viral mRNA


Cells were grown to confluency in 850-cm² roller
bottles (Falcon) and infected with a multiplicity of infection
of approximately 10. At 6 hr p.i., cells were
rinsed twice with Earle's balanced salt solution (EBSS),
scraped from the bottle, transferred to a 50-ml conical
tip polypropylene tube, and pelleted. Cells from five
roller bottles yielded a pellet of approximately 5 ml and
constituted one batch for RNA extraction. Cells were
lysed by the addition of 5 vol (25 ml to a 5-ml cell pellet)
of lysis buffer containing 10 mM Tris-HCl (pH 7.0), 10
mM NaCl, 5 mM MgCl2, and 1% NP-40 (v/v) at room temperature
followed by vigorous vortexing for 10 sec.
Nuclei were removed immediately by centrifugation at
4,000 × g for 5 min, and to the supernatant was added 0.2 vol
of 10% SDS (w/v) in water and 4 mg proteinase K crystals.
The solution was incubated 30 min at 37° and
extracted with an equal volume of phenol-chloroform-isoamyl
alcohol (24:24:1), and RNA was precipitated
with 2.2 vol of ethanol after adding 0.1 vol 3 M
Na acetate. RNA from a 5-ml cell pellet was dissolved
in water and polyadenylated RNA was selected by
oligo(dT)-cellulose chromatography using a binding
buffer of 0.6 M NaCl, 0.1 M Tris-HCl (pH 7.5), 0.2%
SDS (w/v), and an elution buffer of 0.1 M Tris-HCl (pH
7.5), 0.2% SDS (w/v). Polyadenylated RNA was ethanol
precipitated with Na acetate, dissolved in 100 µl
water, ethanol precipitated without salt, dissolved in
100 µl water, and 5 µl of this solution was used in a
50-µl translation reaction.


In vitro translation 


In vitro translation was done using a wheat germ
system (Amersham) in a 50 µl reaction volume that
contained 25 µl wheat germ extract, 1 µl 1 mM amino
acid solution deficient in methionine (Promega Biotec),
3 µl 1 M KAc (to make a final K⁺ concentration of 1.17
mM), 2 µl RNasin (Promega Biotec), 4 µl microsomes
(Amersham, Promega Biotec, or as a gift from Dr.
Peter Walter, University of California School of Medicine,
San Francisco) or 4 µl microsome blank solution
when microsomes were left out, 5 µl 300 nM
octanoyl-asparagine-leucine-threonine or 5 µl water when
the tripeptide was left out, 5 µl [³⁵S]-methionine (>800
Ci/mmol, New England Nuclear), and 5 µl RNA.
Octanoyl-asparagine-leucine-threonine, a competitive inhibitor
of N-linked glycosylation (Lau et al., 1983), was a kind
gift from Dr. Fred Naider, City University of New York,
and was prepared as a 300 µl stock in an aqueous
solution containing 25% dimethyl sulfoxide. Translations
were done for 1 hr at 25°. Sodium carbonate
treatment of microsomes followed the procedure of
Fujiki et al. (1982). Deglycosylation of translation products
was done with N-glycanase (Genzyme Corp.) or
with endoglycosidase H (ICN) using methods recommended
by the manufacturers. Immunoprecipitates
were prepared as described by Anderson and Blobel
(1983) except that iodoacetamide was not used to
block SH groups prior to electrophoresis. Five microliters
porcine hyperimmune TGEV-specific serum was
used per 50 µl translation volume, and protein
A-Sepharose CL-4B (Pharmacia) was used to adsorb the
immunoprecipitates. Porcine hyperimmune anti-TGEV
serum was produced in a specific pathogen-free gilt
(gilt 53) and was a kind gift from Dr. Lorant Kemeny,
National Animal Disease Center (Ames, IA) (Kemeny,
1976). Preprolactin mRNA was generated with SP6
polymerase from cloned cDNA kindly given to us by
Drs. William Hansen and Peter Walter, University of
California School of Medicine (San Francisco, CA)
Fig. 1. Sequencing strategy used to derive the TGEV M gene
sequence. cDNA clones FG5, C4, F5, E2, FT36, FT35, and FT44
were cloned into the PstI site of vector pUC9 and were all found to
be in the same orientation with respect to the virus genomic RNA
illustrated at the top of the figure. That is, the 3’ end of the insert is
near the HindIII site in the multiple cloning region, and the 5’ end is
near the SalI site in the multiple cloning region. FT43 was likewise
cloned but was found to be in the opposite orientation. Nucleotide
position 1 on the restriction map sequence is the first base at the 5’
end (virus-sense) of the FT36 insert. O and L indicate sites labeled
on fragments of clone FG5 at the 3’ end of DNA with reverse transcriptase
and at the 5’ end with polynucleotide kinase, respectively.
@ indicates 3’ end-labeling with reverse transcriptase at the SalI site
in the multiple cloning region of clones C4, F5, E2, and FT36. =
indicates 3’ end-labeling with reverse transcriptase at the HindIII site
in the multiple cloning region of clones C4, F5, E2, FT36, and FT35.
@ indicates 3’ end-labeling with reverse transcriptase at the XhoII
site in clones E2 and FT43, or at the HinfI site on clone FT44.


and β-lactamase mRNA was obtained from Promega
Biotec. 


In vivo labeling 


For labeling intracellular M protein, ST cells in 
60-mm plastic petri dishes were infected with a multi- 
plicity of infection of approximately 10, incubated 1 hr, 
and refed, after rinsing, with 10 ml per dish of minimum 
essential medium containing 5% normal methionine 
concentration, 10% fetal calf serum (Sterile Systems), 
and 200 µCi [³⁵S]methionine (Translabel, ICN). Where
indicated, tunicamycin (Sigma) at a final concentration
of 2 µg/ml was included in the medium used for refeeding.
At 6 hr p.i., cells were rinsed with EBSS,
scraped into a 15-ml conical tip polypropylene tube 
and pelleted, and lysate was prepared by adding 0.5 
ml phosphate-buffered saline, 1% NP-40, 10 units 
Aprotinin (Sigma)/ml, and incubating the mixture at 
25° for 30 min with frequent vortexing. Nuclei and cell 
debris were removed by centrifugation at 13,000 g for 
5 min and 50 µl cell lysate supernatant was used in an
immunoprecipitation reaction as described above for 
                            30                            60                            90                           120
CTATACATGGTGTGTTGCAATTTAGGAAGGACAGTTATTATTGTTCCAGCGCAACATGCTTACGATGCCTATAAGAATTTTATGCGAATTAAAGCATACAACCCCGATGGAGCACTCCTT
                                                       M  L  T  M  P  I  R  I  L  C  E  L  K  H  T  T  P  M  E  H  S  L

                           150                           180                           210                           240
GCTTGAACTAAACAAAATGAAGATTTTGTTAATATTAGCGTGTGTGATTGCATGCGCATGTGGAGAACGCTATTGTGCTATGAAATCCGATACAGATTTGTCATGTCGCAATAGTACAGC
 L  E  L  N  K  M  K  I  L  L  I  L  A  C  V  I  A  C  A  C  G  E  R  Y  C  A  M  K  S  D  T  D  L  S  C  R  N  S  T  A

                           270                           300                           330                           360
GTCTGATTGTGAGTCATGCTTCAACGGAGGCCATCTTATTTGGCATCTTGCAAACTGGAACTTCAGCTGGTCTATAATATTGATCGTTTTTATAACTGTGCTACAATATGGAAGACCTCA
 S  D  C  E  S  C  F  N  G  G  H  L  I  W  H  L  A  N  W  N  F  S  W  S  I  I  L  I  V  F  I  T  V  L  Q  Y  G  R  P  Q

                           390                           420                           450                           480
ATTCAGCTGGTTCGCGTATGGCATTAAAATGCTTATAATGTGGCTATTATGGCCCGTTGTTTTGGCTCTTACGATTTTTAATGCATACTCGGAATACCAAGTGTCCAGATATGTAATGTT
 F  S  W  F  A  Y  G  I  K  M  L  I  M  W  L  L  W  P  V  V  L  A  L  T  I  F  N  A  Y  S  E  Y  Q  V  S  R  Y  V  M  F

                           510                           540                           570                           600
CGGCTTTAGTATTGCAGGTGCAATTGTTACATTTGTACTCTGGATTATGTATTTTGTAAGATCCATTCAGTTGTACAGAAGGACTAAGTCTTGGTGGTCTTTCAACCCTGAAACTAAAGC
 G  F  S  I  A  G  A  I  V  T  F  V  L  W  I  M  Y  F  V  R  S  I  Q  L  Y  R  R  T  K  S  W  W  S  F  N  P  E  T  K  A

                           630                           660                           690                           720
AATTCTTTGCGTTAGTGCATTAGGAAGAAGCTATGTGCTTCCTCTCGAAGGTGTGCCAACTGGTGTCACTCTAACTTTGCTTTCAGGGAATTTGTACGCTGAAGGGTTCAAAATTGCAGG
 I  L  C  V  S  A  L  G  R  S  Y  V  L  P  L  E  G  V  P  T  G  V  T  L  T  L  L  S  G  N  L  Y  A  E  G  F  K  I  A  G

                           750                           780                           810                           840
TGGTATGAACATCGACAATTTACCAAAATACGTAATGGTTGCATTACCTAGCAGGACTATTGTCTACACACTTGTTGGCAAGAAGTTGAAAGCAAGTAGTGCGACTGGATGGGCTTACTA
 G  M  N  I  D  N  L  P  K  Y  V  M  V  A  L  P  S  R  T  I  V  Y  T  L  V  G  K  K  L  K  A  S  S  A  T  G  W  A  Y  Y

                           870                           900                           930                           960
TGTAAAATCTAAAGCTGGTGATTACTCAACAGAGGCAAGAACTGATAATTTGAGTGAGCAAGAAAAATTATTACATATGGTATAACTAAACTTCTAAATGGCCAACCAGGGACAACGTGT
 V  K  S  K  A  G  D  Y  S  T  E  A  R  T  D  N  L  S  E  Q  E  K  L  L  H  M  V                 M  A  N  Q  G  Q  R  V

                           990
CAGTTGGGGAGATGAATCTACCAAAACACGTGGTCGTTCC
 S  W  G  D  E  S  T  K  T  R  G  R  S

Fig. 2. Nucleotide sequence of the TGEV M gene and deduced amino acid sequence for the protein. The nucleotide sequence comes from
the part of the virus genome illustrated in Fig. 1. A continuous open reading frame beginning at nucleotide position 56 and continuing through
nucleotide 922 is identified. The CTAAAC intergenic sequences are underlined. The proposed amino terminus for the M protein is identified by
an underlined methionine residue near base position 137.


products of a 50 µl in vitro translation reaction. The M
protein was immunoprecipitated with 10 µl M-specific
monoclonal antiserum (identified as 1A6; Woods et al.,
1987).

For labeling virion M protein, cells grown in 150-cm²
plastic flasks were infected as described above, and
incubated with 500 µCi [³⁵S]methionine (Translabel,
ICN) per flask. At 18 hr p.i., virus was purified from 
clarified supernatant fluids by isopycnic sedimentation 
in sucrose gradients as previously described (Brian et 
al., 1980). Virion proteins were solubilized in 4% SDS, 
and M protein was immunoprecipitated with M-spe- 
cific monoclonal antibody and deglycosylated with N- 
glycanase. 


Polyacrylamide gel electrophoresis


In vitro translation reaction products or immunopre- 
cipitates on protein A-Sepharose CL-4B beads were 
diluted with an equal volume of 2× Laemmli sample
treatment buffer [1× sample treatment buffer is 0.0625
M Tris-HCl (pH 6.8), 2% SDS, 10% glycerol, 5% 2-mercaptoethanol]
that contained 5 M urea, heated 2
min at 100°, and electrophoresed using the method of 
Laemmli (1970). 


RESULTS 
Deduced amino acid sequence of the matrix protein 


Seven clones, C4, E2, F5, FT35, FT36, FT43, and 
FT44, mapping in the positions illustrated in Fig. 1, 
were sequenced in part to extend the TGEV genomic 
sequence that was known from clones FG5 and J21 
(Kapke and Brian, 1986). Clone FG5 maps at the ex- 
treme 3’ end of the genome and contains the se- 
quence for the hypothetical hydrophobic protein gene, 
the N gene, and part of the M gene. Identification of 
the third open reading frame as the M gene sequence 
was based on regions of extensive amino acid homology
with the M proteins of MHV (Armstrong et al.,
1984) and IBV (Boursnell et al., 1984). The sequencing
strategy we used is described in Fig. 1.

The molecular weight of the glycosylated M protein 
has been estimated from electrophoretic migration 
patterns to be approximately 28 to 30 kDa (Brian et al.,
1983; Garwes and Pocock, 1975; Wesley and Woods, 
1986). We therefore anticipated that we would be able 
to deduce from the gene sequence a molecular weight 
of 28 kDa or less for the unglycosylated protein. Sur- 
prisingly, the completed open reading frame of what 
we had identified earlier as part of the M gene (Kapke 
and Brian, 1986) yielded a protein of 289 amino acids
having a molecular weight of greater than 32,000 (Fig.
2). Since the M proteins of the mouse hepatitis virus
(Armstrong et al., 1984; Tooze et al., 1984) and bovine
coronavirus (Lapps et al., 1987) migrate in sodium dodecyl
sulfate-containing polyacrylamide gels with elec-
Fig. 3. Translocation and glycosylation of the M protein with and without its amino-terminal peptide. (A) Comparison of M protein from in vitro
and in vivo synthesis and from the virion. Lanes 1 through 12, mRNAs isolated from uninfected (U) or infected (I) cells, and in vitro transcripts
prepared from the reconstructed full-length M gene (pGEM-M-1 or 1st AUG) or the truncated M gene (pGEM-M-2 or 2nd AUG), were translated
in vitro and treated as noted. For immunoprecipitation, porcine hyperimmune antiserum was used. Lanes 13 and 14, M protein was immunoprecipitated
from cell lysates with monoclonal antibody. Cell lysates were prepared as noted. Lanes 15 and 16, M protein was immunoprecipitated
from virion proteins with monoclonal antibody. Immunoprecipitates were treated as noted. Lane 17, proteins from purified TGEV were
electrophoresed and the position of the 28-kDa protein is identified. Lanes 16 and 17 are a longer exposure of part of lanes 1 and 2 used in (E).
Lanes 1 through 4, 13, and 14 were overexposed to identify species in relatively lower abundance. (B) Endoglycosidase H treatment of the in
vitro translation products of the reconstructed full-length M gene (1st AUG) and of the truncated M gene (2nd AUG). (C) Sodium carbonate
treatment of microsomes containing the translation products of the reconstructed full-length M gene (1st AUG) and of the truncated M gene
(2nd AUG). (D) Controls used for detecting signal peptidase activity in the microsomes employed. Positions of the uncleaved and cleaved forms
of preprolactin and β-lactamase are shown. (E) Immunoprecipitation showing the specificity of the monoclonal antibody 1A6. All gels were
10% polyacrylamide. ¹⁴C-Radiolabeled protein molecular weight markers were obtained from New England Nuclear.
Fig. 4. Deduced amino acid sequences of four coronavirus M proteins aligned for maximum homology. The TGEV M protein of 262 amino
acids (top row), BCV (Mebus strain) M protein of 230 amino acids (second row), MHV (A59 strain) M protein of 228 amino acids (third row), and
IBV (Beaudette strain) M protein of 225 amino acids (fourth row) were aligned for maximum homology. For this, the 8 amino acid sequence
SWWSFNPE was important in making the initial alignment. Identical amino acids among all four proteins are boxed with a solid line. The 21
amino acid hydrophobic transmembrane regions of the MHV and IBV proteins are boxed with a hatched line (Rottier et al., 1986), with the TGEV
and BCV sequences drawn in parallel with MHV. Potential N-glycosylation sites (for TGEV and IBV) and O-glycosylation sites (for BCV and MHV)
are indicated with a solid circle above the amino acid. The signal peptide-like properties of the amino-terminal 16 amino acids are depicted. Within
this region the 11 hydrophobic amino acids are underlined. The basic amino acids at positions 2 and 18 and the acidic amino acid at position 17
are indicated. The potential site for peptidase cleavage, according to the rules of von Heijne (1986), is indicated by an open triangle. The
numbers 1 and 2 above the first and second methionine residues in the TGEV M sequence indicate initiation sites for translation of the in vitro
transcripts derived from pGEM-M-1 and pGEM-M-2, respectively.


trophoretically determined molecular weights that are 
11% less than their deduced molecular weight (23 vs 
26 for the mouse hepatitis and 22 vs 26 for the bovine 
coronavirus), an apparent reflection of their hydropho- 
bic nature, it is possible that the entire open reading 
frame identified above encodes the TGEV M protein. 
We hypothesized that this is unlikely, however, based 
on the documented evidence for leader-primed transcription
in coronavirus replication (Makino et al.,
1986b), and on the existence of a primer binding-like 
intergenic sequence early in the open reading frame. 
The most probable site for initiation of transcription of 
the M message is suggested by the sequence 
CTAAAC beginning at base 128 in Fig. 2, which is part 
of a conserved intergenic sequence in the TGEV ge- 
nome. It is a sequence found in total and again in part 
between the M and N genes beginning at base 926 in 
Fig. 2, and also between the N and hypothetical hy- 
drophobic protein genes (Kapke and Brian, 1986). It is 


also part of the intergenic sequence found in the MHV 
genome (Budzilowicz et al., 1985). If CTAAAC functions
as part of an intergenic sequence that directs
leader-primed synthesis and thereby defines the start 
of the M transcript for TGEV, then the M protein coding 
sequence could start with the first available methio- 
nine 3’-ward of the CTAAAC sequence, a codon that 
begins at base 137 in Fig. 2. It could also start with the 
second, downstream, in-frame AUG codon beginning 
at base 200, but this is surrounded by a much less 
favorable Kozak consensus sequence (Kozak, 1983). 
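
The positions quoted above can be reproduced from the printed sequence with a short search. This is a minimal sketch under stated assumptions: the two fragments are copied from Fig. 2 (bases 121 through 210 and 911 through 950), and the helper `first_atg_after_motif` is a name introduced here for illustration only.

```python
# Illustrative sketch: locate CTAAAC and the first downstream ATG in two
# fragments of the Fig. 2 sequence (coordinates are Fig. 2 base numbers).

from typing import Optional, Tuple

INTERGENIC = "CTAAAC"

# Keys are the Fig. 2 coordinate of the first base of each fragment.
FRAGMENTS = {
    121: (
        "GCTTGAACTAAACAAAATGAAGATTTTGTTAATATTAGCG"
        "TGTGTGATTGCATGCGCATGTGGAGAACGCTATTGTGCTA"
        "TGAAATCCGA"
    ),
    911: "TTACATATGGTATAACTAAACTTCTAAATGGCCAACCAGG",
}


def first_atg_after_motif(start: int, seq: str) -> Optional[Tuple[int, int]]:
    """Return (motif position, first downstream ATG position), or None."""
    motif_index = seq.find(INTERGENIC)
    if motif_index < 0:
        return None
    atg_index = seq.find("ATG", motif_index + len(INTERGENIC))
    if atg_index < 0:
        return None
    return start + motif_index, start + atg_index


results = [first_atg_after_motif(start, seq) for start, seq in FRAGMENTS.items()]
for motif_pos, atg_pos in results:
    print(f"CTAAAC at base {motif_pos}, first downstream ATG at base {atg_pos}")
```

On the printed sequence this finds CTAAAC at base 128 followed by an ATG at base 137 (the proposed M start), and CTAAAC at base 926 followed by an ATG at base 938 (the start of the N reading frame).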
To test our hypothesis, we prepared transcripts 
identical to the postulated functional mRNA structure
(transcripts from the reconstructed full-length M gene,
construct pGEM-M-1, that would initiate translation at
the first AUG downstream from the CTAAAC sequence)
and compared sizes of the resulting translation
products with those of mRNA isolated from infected
cells. Figure 3A, lanes 2 and 5, illustrates that
M protein translated in vitro from cell-derived, poly(A)-selected
mRNA and immunoprecipitated with TGEV-
specific antibody had an electrophoretically deter- 
mined molecular weight of 25K and comigrated with 
protein translated from the reconstructed full-length M 
gene. Protein translated from the reconstructed full- 
length M gene immunoprecipitated with the same an- 
tibody thus confirming its authenticity (Fig. 3A, lane 8). 
Furthermore, truncated M protein (generated from 
construct pGEM-M-2), although also antigenically au- 
thentic, migrated distinctly faster than full-length M but 
with a migration rate much less than expected for a 
2-kDa molecular weight difference (Fig. 3A, lanes 9 
and 12). The small difference in electrophoretic mobil- 
ity between the two forms of M again is an ostensible 
function of the hydrophobic nature of the protein. The 
electrophoretically determined molecular weight of the 
truncated M polypeptide is 24.5K. Judging from the 
size of the various translation products, it is unlikely 
that initiation of translation in vivo starts at any place
other than at the first AUG downstream from the 
CTAAAC intergenic sequence. 

When the first methionine codon downstream from 
the CTAAAC sequence is used as the initiation site for 
translation, the deduced M protein comprises 262 
amino acids and has a molecular weight of 29,544. It 
is moderately hydrophobic with 44% of its amino acids 
being hydrophobic, and is basic since it carries a net 
charge of +7 at neutral pH. 
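
The protein lengths follow directly from the base coordinates in Fig. 2. The arithmetic below is a sketch using only numbers given in the text (the open reading frame ends at base 922, with candidate starts at bases 56, 137, and 200); the 241-residue figure for the truncated product is derived here and is not stated in the original.

```python
# Illustrative arithmetic: residues encoded from each candidate start codon.

LAST_CODING_BASE = 922              # last base of the open reading frame (Fig. 2)


def residues_from(start_base: int) -> int:
    """Number of codons from start_base through the last coding base."""
    span = LAST_CODING_BASE - start_base + 1
    if span % 3 != 0:
        raise ValueError("start codon is not in frame with the end of the ORF")
    return span // 3


orf_length = residues_from(56)      # entire open reading frame
full_length = residues_from(137)    # first AUG after CTAAAC (pGEM-M-1)
truncated = residues_from(200)      # second in-frame AUG (pGEM-M-2), derived value

removed = full_length - truncated   # residues missing from the truncated product
site_in_truncated = 32 - removed    # Asn-32 renumbered in the truncated product

print(orf_length, full_length, truncated, removed, site_in_truncated)
# prints: 289 262 241 21 11
```

The renumbered glycosylation site (position 11) agrees with the position given for the pGEM-M-2 translation product.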


A potentially cleavable amino-terminal signal 
peptide is not an absolute requirement for 
membrane translocation and glycosylation 

A comparison of the amino acid sequence for the M 
proteins of TGEV, BCV, MHV, and IBV (Fig. 4) reveals 
Fig. 5. Proposed topography of the TGEV M protein. The proposed
topography of the TGEV M protein with an uncleaved amino-terminal
signal peptide is based on the topographical arrangement
of the MHV M protein (Rottier et al., 1986). Our model predicts that
the amino-terminal signal peptide aids in the translocation of an
especially long amino terminus, but is not removed by signal peptidase
until some step in virus assembly is achieved. The TGEV M
protein, like the MHV M protein, contains internal signals for translocation
of the three internal transmembrane regions of the protein.
The N-glycosylation site (y) and O-glycosylation site (9) on the proteins
are shown.


several features that are shared among all four viruses, 
but also one feature for TGEV that is strikingly con- 
trasting. Regions of high sequence homology are 
found among the proteins. Most notable is a stretch of 
8 amino acids beginning at position 132 on the TGEV 
sequence that is identical for all four viruses. By com- 
puter analysis using a protein alignment function, 34, 
32, and 19% of the TGEV protein sequence is homolo- 
gous with that of BCV, MHV, and IBV, respectively. 
Furthermore, a hydrophobicity plot of the TGEV M 
protein shows three internal hydrophobic domains 
that align with similar domains in BCV, MHV, and IBV 
(Fig. 4 and data not shown; MHV, Rottier et al., 1984, 1986;
BCV, Lapps et al., 1987; IBV, Boursnell et al.,
1984). This suggests that the topology of the four pro- 
teins is similar; that is, from its entrance into the virion 
membrane and as it extends toward the carboxy termi- 
nus, the protein spans the membrane three times and 
has a relatively hydrophilic intravirion carboxy-terminal 
region (Rottier et a/., 1984, 1986). 

The striking feature of the TGEV M protein is its 
much longer amino terminus that includes a sequence 
resembling a cleavable peptide for membrane inser- 
tion. Assuming a parallel structure for the M proteins 
of the four viruses and assuming the MHV M protein 
enters the virion envelope at position 26 (Rottier et al.,
1986), then the external amino-terminal portion is 28
amino acids for BCV, 21 for IBV, and 54 for TGEV.
Within the first 54 amino acids there is one asparagine
at position 32 that has the appropriate surrounding
sequence required for N-linked glycosylation (Hubbard
and Ivatt, 1981), the kind of glycosylation shown for
the TGEV M protein (Jacobs et al., 1986). Unlike the
amino terminus of the MHV, BCV, and IBV proteins,
the TGEV M protein is hydrophobic for the first 16
amino acids. Eleven of the sixteen terminal amino
acids are hydrophobic, and amino acids at positions 2,
17, and 18 are charged (Fig. 4). By inspection, this
sequence has the properties of a cleavable amino-terminal
signal peptide and from the “−3, −1” rule (von
Heijne, 1986) peptidase cleavage would occur between
amino acids 16 and 17.
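
The features described in this paragraph can be checked against the amino terminus itself. The sketch below rests on several assumptions that are ours: the 35-residue string is the translation of bases 137 through 241 of Fig. 2 with the standard genetic code, the hydrophobic set (A, I, L, M, F, V, W) is one common convention that the paper does not specify, and the N-glycosylation sequon is taken as Asn-X-Ser/Thr with X not proline.

```python
# Illustrative sketch: signal-peptide-like features of the TGEV M amino terminus.

N_TERMINUS = "MKILLILACVIACACGERYCAMKSDTDLSCRNSTA"   # residues 1-35 (assumed translation)

HYDROPHOBIC = set("AILMFVW")        # assumed convention, not stated in the paper
CHARGED = set("KRDE")


def sequon_positions(seq: str) -> list[int]:
    """1-based positions of Asn in an Asn-X-Ser/Thr sequon (X is not Pro)."""
    return [
        i + 1
        for i in range(len(seq) - 2)
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST"
    ]


signal = N_TERMINUS[:16]
hydrophobic_count = sum(residue in HYDROPHOBIC for residue in signal)
charged_sites = {i + 1: r for i, r in enumerate(N_TERMINUS[:18]) if r in CHARGED}

full_sites = sequon_positions(N_TERMINUS)
# The truncated product starts at the second in-frame methionine (residue 22).
truncated_sites = sequon_positions(N_TERMINUS[21:])

print(hydrophobic_count)            # hydrophobic residues among the first 16
print(charged_sites)                # charged residues within the first 18
print(full_sites, truncated_sites)  # sequon position in each translation product
```

Under these assumptions the count of 11 hydrophobic residues, the charged residues at positions 2, 17, and 18, and the sequon at position 32 (position 11 in the truncated product) all match the values given in the text.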

Since an amino-terminal peptide is not required for 
membrane translocation and glycosylation of the M 
proteins of BCV, MHV, and IBV, we examined what 
effect the peptide had on the translocation of the 
TGEV M protein. Both forms of the reconstructed M 
gene generate the asparagine glycosylation site when 
translated, pGEM-M-1 at amino acid position number
32 and pGEM-M-2 at amino acid position number 11 
of their respective translation products (Fig. 4). Both 
forms of transcripts were therefore translated in the 
presence of microsomes known to have glycosylating 
activity. Figures 3A, lanes 5, 6, 9, and 10, 3B, lanes 1,
2, 3, and 4, and 3C, lanes 1, 2, 3, and 4, illustrate that,
whereas the full-length reconstructed M gene (1st
AUG) yielded a 25-kDa protein that was mostly glyco- 
sylated, 65-88% as determined from optical density 
tracings, the truncated M gene (2nd AUG) yielded a 
protein that was also glycosylated but to a far lesser 
extent, only 10-30%. The glycosylated products of 
both the 1st AUG and 2nd AUG transcripts were ap- 
proximately 28 kDa (Fig. 3A, lanes 6 and 10), the same 
size as the glycosylated product from viral mRNA 
translation (Fig. 3A, lane 3) and glycosylated virion M
(Fig. 3A, lanes 16 and 17). Glycosylation was con- 
firmed to have taken place since the products from in 
vitro translation migrated again with the unglycosy- 
lated polypeptides after they had been digested with 
N-glycanase (Fig. 3A, lanes 7 and 11) or endoglycosi- 
dase H (Fig. 3B, lanes 5 and 6). These results demon- 
strate that although the asparagine glycosylation site 
of both the full-length and truncated forms of M protein 
became translocated to the lumenal side of the micro- 
some, translocation with the amino-terminal peptide 
present was far more efficient. Furthermore, translo- 
cation of the amino terminus appeared to be specifi- 
cally enhanced by the peptide since carbonate treat- 
ment showed both full-length and truncated forms of 
the protein as a whole to be equally membrane em- 
bedded (Fig. 3C, lanes 5, 6, 7, and 8), presumably as a 
result of internal translocation signals of the kind described 
for the MHV and IBV M proteins (Rottier et al., 
1985; Machamer and Rose, 1987). Following carbon- 
ate treatment greater than 85% of both the full-length 
and truncated forms of the M protein remained mem- 
brane bound as determined by optical density tracing 
of the autoradiogram. 
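
The percentages above come from optical density tracings of the autoradiograms. The calculation itself is not given in the text; the following is a minimal sketch, assuming the glycosylated fraction is the integrated density of the 28-kDa band divided by the summed density of the 28-kDa and 25-kDa bands in the same lane. The function name and the density values are hypothetical and serve only as an illustration.

```python
def glycosylated_fraction(od_glycosylated: float, od_unglycosylated: float) -> float:
    """Return the fraction of product in the glycosylated band.

    Assumption (not stated in the text): each argument is the integrated
    optical density of one band from a tracing of a single gel lane.
    """
    if od_glycosylated < 0 or od_unglycosylated < 0:
        raise ValueError("optical densities must be non-negative")
    total = od_glycosylated + od_unglycosylated
    if total == 0:
        raise ValueError("no signal in either band")
    return od_glycosylated / total


# Hypothetical band densities in arbitrary units, for illustration only.
example = glycosylated_fraction(od_glycosylated=3.0, od_unglycosylated=1.0)
print(f"{example:.0%} glycosylated")
```

The same ratio, applied to pellet and supernatant bands, would give the membrane-bound fraction after carbonate treatment, again under the assumption that band density is proportional to the amount of labeled protein.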


Cleavage of the amino-terminal peptide did not 
appear to occur during in vitro translation 


To test for amino-terminal peptide cleavage, viral 
mRNA from infected cells and transcripts of the cloned 
M gene were translated in the presence of micro- 
somes known to contain signal peptidase activity (Fig. 
3D). In initial experiments, a tripeptide serving as a 
competitive inhibitor of asparagine-linked glycosylation 
(Lau et al., 1983) was incubated with the microsomal-translation 
mixture in order to inhibit concurrent 
glycosylation that would otherwise obscure the results 
of peptide cleavage. Although inhibition of glycosyla- 
tion was never complete, at no time was there evi- 
dence of peptide cleavage (data not shown). To use a 
second approach, mRNA and transcripts were trans- 
lated in the presence of microsomes and the products 
were deglycosylated with either N-glycanase or endoglycosidase H, 
and sizes were compared by electrophoresis (Figs. 3A, lanes 4, 7, 
and 11, and 3B, lanes 5 
and 6). Interestingly, the sizes of the deglycosylated 
products, as described above, appeared to be no 
smaller than the polypeptides synthesized without mi- 
crosomes. There appeared, therefore, to be no 
amino-terminal peptide removal during in vitro translation 
in the presence of microsomes. 

Results of experiments to determine whether the 
N-terminal peptide is cleaved in vivo suggest that 
cleavage in vivo may be dependent upon the glycosyl- 
ation of M or other glycoproteins. When infected cells 
were incubated in the presence of tunicamycin and 
radiolabeled M was immunoprecipitated from cell ly- 
sate using monoclonal antibody, only one major band 
was found and it migrated as an uncleaved protein of 
25 kDa (Fig. 3A, lane 14). Similar results were obtained 
when polyclonal hyperimmune TGEV serum was used 
(data not shown). On the other hand, in the absence of 
tunicamycin, two forms of M were precipitated. These 
were a 28-kDa species, the size of fully glycosylated 
M, and a 24.5-kDa species, the size of M from which 
the N-terminal signal had been removed (Fig. 3A, lane 
13). Glycosylation in vivo may have important conse- 
quences on the exposure of the peptidase cleavage 
site because of interactions between M and other viral 
components in vivo. 

In light of a recent report demonstrating by amino- 
terminal amino acid sequencing that the virion form of 
TGEV M protein is indeed cleaved (Laude et al., 1987), 
we included deglycosylated, radiolabeled virion M in 
our electrophoretic analysis (Fig. 3A, lane 15). The 
deglycosylated virion M did not migrate as a cleaved 
polypeptide but rather migrated with an apparent mo- 
lecular weight of approximately 26 kDa suggesting 
that it had perhaps undergone a second as yet unde- 
termined modification after becoming virion-asso- 
ciated. 


DISCUSSION 


The M protein of MHV was the first coronavirus M 
protein to be sequenced (Armstrong et al., 1984) and 
its topography with regard to membrane orientation 
and insertion has been carefully documented (Rottier 
et al., 1984, 1986). It therefore serves as the prototy- 
pic coronavirus M protein to which others can be 
compared. The coronavirus M protein apparently 
functions to direct the budding of virus into the rough 
endoplasmic reticulum and the Golgi since it inserts 
into these membranes and is found at highest concentrations 
there (Holmes et al., 1984; Tooze et al., 
1984). Presumably virus assembly is mediated 
through an interaction between the M protein (in the 
membrane) and the N protein (in the cytoplasm) or the 
RNA (in the cytoplasm), or both. The M protein of MHV 
does not have an amino-terminal cleavable signal 
peptide for membrane translocation, but rather utilizes 
one or more of its three internal hydrophobic domains 
for membrane insertion (Rottier et al., 1985). The same 
is also true for the IBV M protein (Machamer and Rose, 
1987), and is probably true for the BCV M protein 
(Lapps et al., 1987) which has a structure similar to 
that of the MHV M protein. 

It came as a surprise to us, therefore, that the deduced 
TGEV M protein has, in addition to the three 
internal hydrophobic domains, an amino-terminal se- 
quence that possesses the properties of a cleavable 
signal peptide for membrane insertion (von Heijne, 
1986). The potentially cleavable N-terminal peptide 
was also revealed by the work of Laude et al. (1987) in 
which a nearly identical M gene sequence was re- 
ported for the same strain of TGEV. In their sequence, 
bases in positions 375, 468, and 720 of Fig. 2 were T, 
C, and A, respectively, making the corresponding 
amino acids at these codon positions Val, Asn, and 
Asp. Furthermore, they obtained direct evidence that 
the N-terminal signal peptide was cleaved from the 
virion-associated M protein. At least two fundamental 
questions are therefore raised by the existence of an 
amino-terminal signal peptide in the TGEV M protein: 
(i) What is the evolutionary origin of such a sequence, 
and (ii) what role does the sequence play for the TGEV 
M protein? 

With regard to the evolutionary origin of the hydro- 
phobic amino terminus, two possibilities can be enter- 
tained, assuming the four coronaviruses, TGEV, BCV, 
MHV, and IBV have a common evolutionary origin. (i) 
There was an amino-terminal hydrophobic sequence 
in the primordial protein which was lost during the 
evolution of BCV, MHV, and IBV since it is superfluous 
for membrane insertion and virus assembly. Mecha- 
nistically, the loss of a genetic sequence could be 
explained by the dissociating-reassociating polymerase 
hypothesized by Lai et al. (Makino et al., 1986a). 
The nucleotide sequence encoding the amino-termi- 
nal hydrophobic sequence could have been eliminated 
by the polymerase as it copied the negative-strand 
template. (ii) There was no amino-terminal hydropho- 
bic sequence in the primordial protein but the TGEV M 
protein acquired one during evolution. The polymerase 
during replication of the TGEV genome could have 
become dissociated and then reassociated with an- 
other minus-strand RNA template that carried the se- 
quence for a hydrophobic signal. While negative- 
strand RNA of this kind is probably nonexistent or of 
low abundance in eucaryotic cells normally, it does 
exist in cells coinfected with another RNA virus. Con- 
ceivably the TGEV M protein could have acquired its 
amino-terminal sequence by copying the negative 
strand of another RNA virus. In this regard, it is interesting 
that 6 of the first 8 amino acids in the TGEV M 
sequence are identical to the VSV G protein signal 
sequence (Rose and Gallione, 1981). 

With regard to the function of the amino-terminal 
hydrophobic sequence, we propose that it aids in the 
translocation of the amino terminus of the TGEV M 
protein through the endoplasmic reticulum as would 
other amino-terminal signal peptides, but it is not an 
absolute requirement. The TGEV M protein without its 
amino-terminal signal peptide, as approximated by the 
translation product of the pGEM-M-2 construct which 
is 4 amino acids shorter than the polypeptide identified 
by the peptidase cleavage site (Laude et al., 1987), 
can translocate apparently by the use of an internal 
signal(s) of the type described for the MHV and IBV M 
proteins (Rottier et al., 1985; Machamer and Rose, 
1987). It is interesting to note that even after removal 
of the amino-terminal peptide, the TGEV M protein 
extends 12 amino acids farther from the envelope than 
does BCV, 11 amino acids farther than MHV, and 16 
amino acids farther than IBV. Perhaps the amino-ter- 
minal signal peptide, while not being an absolute re- 
quirement for translocation of the TGEV M protein, 
aids enough in the translocation of the additional ex- 
ternal amino-terminal sequence that it was evolution- 
arily selected. 
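
The relationship between the full-length product, the pGEM-M-2 product, and the cleaved polypeptide can be checked against positions already reported in the text: the glycosylation site lies at residue 32 of the pGEM-M-1 product and residue 11 of the pGEM-M-2 product, and the pGEM-M-2 product is 4 amino acids shorter than the cleaved polypeptide. The sketch below is simple bookkeeping, assuming both products are read in the same frame and carry the same glycosylated asparagine. The names are hypothetical, and the inferred start of the cleaved polypeptide follows from these numbers alone rather than from an independent measurement.

```python
# Positions of the glycosylated asparagine, as reported in the text.
SITE_IN_FULL_LENGTH = 32    # pGEM-M-1 product (1st AUG)
SITE_IN_TRUNCATED = 11      # pGEM-M-2 product (2nd AUG)
# pGEM-M-2 product is this many residues shorter than the cleaved polypeptide.
TRUNCATED_SHORTER_BY = 4


def residues_removed(site_full: int, site_truncated: int) -> int:
    """Number of amino-terminal residues absent from the truncated product."""
    if site_truncated > site_full:
        raise ValueError("site cannot lie farther from a shorter amino terminus")
    return site_full - site_truncated


removed = residues_removed(SITE_IN_FULL_LENGTH, SITE_IN_TRUNCATED)
truncated_start = removed + 1                           # first residue of pGEM-M-2 product
cleaved_start = truncated_start - TRUNCATED_SHORTER_BY  # first residue after cleavage
signal_length = cleaved_start - 1                       # residues removed by cleavage

print(removed, truncated_start, cleaved_start, signal_length)
```

Under these assumptions the pGEM-M-2 product begins 21 residues into the full-length sequence, which is why it only approximates the cleaved form of the protein.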

Since the amino-terminal peptide is apparently not 
removed from the M protein during in vitro translation 
in the presence of microsomes, but is removed from 
the M protein in the assembled virion (Laude et al., 
1987), we propose that its cleavage depends on the 
context in which it finds itself. Perhaps a step in virion 
assembly, for example, an interaction with another 
viral protein or viral RNA, may be required for cleavage 
to occur. Glycosylation of the M protein alone is apparently 
not a prerequisite for cleavage in vitro (Figs. 
3A and B) but may be important in the in vivo context. 
Context is important for the cleavability of other signal 
peptides. For example, under certain constraints the 
potentially cleavable signal on the invariant (Ii) chain 
of class II histocompatibility antigens (Lipp and Dob- 
berstein, 1986) is not cleaved. We further propose that 
the M protein with an uncleaved signal peptide would 
have the orientation depicted in Fig. 5. With this orien- 
tation, the cleavage site is buried in the membrane and 
fewer than 37 amino acids are exposed as a loop on 
the lumenal side of the endoplasmic reticulum. Once 
virion assembly begins, the cleavage site becomes 
exposed to the signal peptidase, cleavage occurs, and 
37 amino acids, including a glycosylated asparagine, 
are left remaining on the virion surface. Because of the 
amino-terminal hydrophobic sequence, the TGEV M 
protein may behave differently than its MHV, BCV, or 
IBV counterparts with regard to intracellular trafficking. 